Bonferroni Correction

In statistics, the Bonferroni correction is a method to counteract the multiple comparisons problem.


Background

The method is named for its use of the Bonferroni inequalities. An extension of the method to confidence intervals was proposed by Olive Jean Dunn.

Statistical hypothesis testing is based on rejecting the null hypothesis if the likelihood of the observed data under the null hypotheses is low. If multiple hypotheses are tested, the probability of observing a rare event increases, and therefore the likelihood of incorrectly rejecting a null hypothesis (i.e., making a Type I error) increases. The Bonferroni correction compensates for that increase by testing each individual hypothesis at a significance level of \alpha/m, where \alpha is the desired overall alpha level and m is the number of hypotheses. For example, if a trial is testing m = 20 hypotheses with a desired \alpha = 0.05, then the Bonferroni correction would test each individual hypothesis at \alpha = 0.05/20 = 0.0025. Likewise, the same phenomenon appears when constructing multiple confidence intervals.
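The adjustment described above can be sketched as a one-line computation; the helper name below is illustrative, not a standard library function:

```python
def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-hypothesis significance level so that the family-wise
    error rate is controlled at the overall level alpha."""
    if m < 1:
        raise ValueError("m must be a positive number of hypotheses")
    return alpha / m

# The example from the text: 20 hypotheses at a desired overall alpha of 0.05
# gives a per-test level of 0.05 / 20 = 0.0025.
per_test_level = bonferroni_alpha(0.05, 20)
```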


Definition

Let H_1,\ldots,H_m be a family of hypotheses and p_1,\ldots,p_m their corresponding p-values. Let m be the total number of null hypotheses, and let m_0 be the number of true null hypotheses (which is presumably unknown to the researcher). The family-wise error rate (FWER) is the probability of rejecting at least one true H_i, that is, of making at least one type I error. The Bonferroni correction rejects the null hypothesis for each p_i \leq \frac{\alpha}{m}, thereby controlling the FWER at \leq \alpha. Proof of this control follows from Boole's inequality, as follows:

: \text{FWER} = P\left\{ \bigcup_{i=1}^{m_0} \left( p_i \leq \frac{\alpha}{m} \right) \right\} \leq \sum_{i=1}^{m_0} P\left( p_i \leq \frac{\alpha}{m} \right) = m_0 \frac{\alpha}{m} \leq \alpha.

This control does not require any assumptions about dependence among the p-values or about how many of the null hypotheses are true.
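The rejection rule above can be sketched directly in code (the function name is an assumption for illustration, not a library API):

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H_i exactly when p_i <= alpha / m, which controls
    the family-wise error rate at alpha."""
    m = len(p_values)
    threshold = alpha / m
    return [p <= threshold for p in p_values]

# With m = 4 the per-test threshold is 0.05 / 4 = 0.0125,
# so only the first hypothesis below is rejected.
decisions = bonferroni_reject([0.001, 0.02, 0.04, 0.30], alpha=0.05)
```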


Extensions


Generalization

Rather than testing each hypothesis at the \alpha/m level, the hypotheses may be tested at any other combination of levels that add up to \alpha, provided that the level of each test is decided before looking at the data. For example, for two hypothesis tests, an overall \alpha of 0.05 could be maintained by conducting one test at 0.04 and the other at 0.01.
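This weighted variant can be sketched as follows (a hypothetical helper, assuming the per-test levels were fixed in advance):

```python
def weighted_bonferroni_reject(p_values, levels):
    """Reject H_i when p_i <= levels[i]. FWER control at sum(levels)
    holds only if the levels are chosen before looking at the data."""
    if len(p_values) != len(levels):
        raise ValueError("need one significance level per hypothesis")
    return [p <= a for p, a in zip(p_values, levels)]

# The example from the text: an overall alpha of 0.05 split as 0.04 + 0.01.
# The first test is rejected (0.03 <= 0.04); the second is not (0.02 > 0.01).
decisions = weighted_bonferroni_reject([0.03, 0.02], levels=[0.04, 0.01])
```

Note that under the equal split 0.025/0.025, the second p-value (0.02) would have been rejected instead, which is why the allocation must be pre-specified.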


Confidence intervals

The procedure proposed by Dunn can be used to adjust confidence intervals. If one establishes m confidence intervals, and wishes to have an overall confidence level of 1-\alpha, each individual confidence interval can be adjusted to the level of 1-\frac{\alpha}{m}.
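As a minimal sketch of this adjustment (the function name is illustrative):

```python
def per_interval_level(overall_level: float, m: int) -> float:
    """Confidence level for each of m intervals so that all m cover
    their parameters simultaneously with probability >= overall_level."""
    alpha = 1.0 - overall_level
    return 1.0 - alpha / m

# Five simultaneous intervals at an overall 95% level:
# each must be built at the 1 - 0.05/5 = 99% level.
level = per_interval_level(0.95, 5)
```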


Continuous problems

When searching for a signal in a continuous parameter space there can also be a problem of multiple comparisons, or look-elsewhere effect. For example, a physicist might be looking to discover a particle of unknown mass by considering a large range of masses; this was the case during the Nobel Prize winning detection of the Higgs boson. In such cases, one can apply a continuous generalization of the Bonferroni correction by employing Bayesian logic to relate the effective number of trials, m, to the prior-to-posterior volume ratio.


Alternatives

There are alternative ways to control the family-wise error rate. For example, the Holm–Bonferroni method and the Šidák correction are universally more powerful procedures than the Bonferroni correction, meaning that they are always at least as powerful. Unlike the Bonferroni procedure, these methods do not control the expected number of Type I errors per family (the per-family Type I error rate).
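To illustrate the power difference, here is a sketch of the Holm–Bonferroni step-down procedure, written from its standard description rather than taken from a library:

```python
def holm_reject(p_values, alpha=0.05):
    """Holm-Bonferroni step-down: compare the sorted p-values against
    alpha/m, alpha/(m-1), ..., alpha/1 in order, stopping at the first
    failure. Rejects at least as many hypotheses as plain Bonferroni."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject

p = [0.01, 0.02, 0.03, 0.005]
# Plain Bonferroni at alpha = 0.05 uses the single threshold 0.0125 and
# rejects only 0.005 and 0.01; Holm steps through the thresholds
# 0.0125, 0.0167, 0.025, 0.05 and rejects all four hypotheses.
holm_decisions = holm_reject(p, alpha=0.05)
```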


Criticism

With respect to FWER control, the Bonferroni correction can be conservative if there are a large number of tests and/or the test statistics are positively correlated. The correction comes at the cost of increasing the probability of producing false negatives, i.e., reducing statistical power. There is not a definitive consensus on how to define a family in all cases, and adjusted test results may vary depending on the number of tests included in the family of hypotheses. Such criticisms apply to FWER control in general, and are not specific to the Bonferroni correction.


External links


Bonferroni, Sidak online calculator